Tag

#reinforcement learning

55 articles

Qwen’s Former Lead on What Hybrid Thinking Got Wrong — and Why He Now Backs Agents

This article explores the limitations of hybrid thinking in AI models and why researchers like Junyang Lin are now advocating for agentic thinking as a more robust and scalable approach.

Jul 4 · 22

Airwallex raises $320m at an $11bn valuation, betting on agentic finance

This explainer explores agentic finance, a field in which AI agents autonomously manage financial tasks. Learn how reinforcement learning, deep learning, and transformer models enable these systems to make financial decisions.

Jun 25 · 43

DeepReinforce Releases Ornith-1.0: An Open-Source Coding Model Family That Learns Its Own RL Scaffolds

DeepReinforce has released Ornith-1.0, an open-source coding model family that learns its own reinforcement learning scaffolding during training. The 397B-parameter flagship model achieved a score of 82.4 on SWE-Bench Verified.

Jun 25 · 22

Patronus AI lands $50M to build ‘digital worlds’ that stress-test AI agents

Learn to build a digital world environment for AI agent testing using Python, reinforcement learning, and PyTorch. This tutorial demonstrates how to create a simulated environment where AI agents navigate obstacles and learn optimal behaviors.

Jun 25 · 40

Prime Intellect Releases prime-rl 0.6.0 to Train Trillion-Parameter MoE Models on Agentic RL Workloads

Prime Intellect releases prime-rl 0.6.0, an open framework for training trillion-parameter Mixture-of-Experts models using asynchronous reinforcement learning.

Jun 22 · 37

The AI world is getting ‘loopy’

This explainer explores the concept of "looping AI": an approach in which multiple AI agents continuously interact in feedback cycles, enabling autonomous self-improvement and adaptation in dynamic environments.

Jun 22 · 24

OpenAI researchers show small doses of "beneficial trait" training make AI models broadly safer and harder to manipulate

OpenAI researchers show that training AI models on small doses of beneficial traits like truthfulness and corrigibility improves safety and performance across domains.

Jun 19 · 46

tech

The best early Prime Day robot vacuum deals I'd buy now, after testing dozens of them

This explainer explores the advanced AI concepts behind modern robot vacuums, including SLAM algorithms, sensor fusion, and reinforcement learning techniques that enable autonomous navigation and adaptive cleaning.

Jun 18 · 19

tech

Amazon has discounted a Lenovo IdeaPad for 73% off, and it's actually worth considering

This article explains how AI-powered dynamic pricing systems work and why they're transforming consumer technology markets. Learn about the machine learning algorithms behind real-time price optimization.

Jun 16 · 23

Andrew Yang thinks the next big startup opportunity is lowering the cost of living

This explainer explores how artificial intelligence can optimize the cost of living by systematically reducing expenses in areas like housing, food, and wireless services through advanced optimization algorithms and machine learning techniques.

Jun 12 · 38

Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Stateful Search Harness on gpt-oss-20b

Learn to build a stateful search harness system inspired by Harness-1, a 20B-parameter retrieval subagent trained with reinforcement learning. This tutorial teaches you to implement candidate pooling, evidence graph maintenance, and reinforcement-learning-based decision making.

Jun 6 · 25

How C3 AI agents will automate predictive maintenance for Shell

This article explains how autonomous AI agents using reinforcement learning and causal inference are transforming industrial maintenance from reactive to predictive, as demonstrated by Shell's deployment of C3 AI technology.

Jun 5 · 39